
1. Identity statement
Reference Type: Conference Paper (Conference Proceedings)
Site: sibgrapi.sid.inpe.br
Holder Code: ibi 8JMKD3MGPEW34M/46T9EHH
Identifier: 8JMKD3MGPEW34M/45CKPKE
Repository: sid.inpe.br/sibgrapi/2021/09.04.20.35
Last Update: 2021:09.04.20.35.48 (UTC) administrator
Metadata Repository: sid.inpe.br/sibgrapi/2021/09.04.20.35.48
Metadata Last Update: 2022:06.14.00.00.25 (UTC) administrator
DOI: 10.1109/SIBGRAPI54419.2021.00052
Citation Key: SantosSiDaRoDrDu:2021:FoUnAp
Title: A Form Understanding Approach to Printed and Structured Engineering Documentation
Format: On-line
Year: 2021
Access Date: 2024, May 06
Number of Files: 1
Size: 34470 KiB
2. Context
Author: 1 Santos, Gabriel Lavoura dos
2 Silva, Vanessa Telles da
3 Dalmolin, Laura de Aguiar
4 Rodrigues, Ricardo Nagel
5 Drews Jr, Paulo Lilles Jorge
6 Duarte Filho, Nelson Lopes
Affiliation: 1 Universidade Federal do Rio Grande, Brazil
2 Universidade Federal do Rio Grande, Brazil
3 Universidade Federal do Rio Grande, Brazil
4 Universidade Federal do Rio Grande, Brazil
5 Universidade Federal do Rio Grande, Brazil
6 Universidade Federal do Rio Grande, Brazil
Editor: Paiva, Afonso
Menotti, David
Baranoski, Gladimir V. G.
Proença, Hugo Pedro
Junior, Antonio Lopes Apolinario
Papa, João Paulo
Pagliosa, Paulo
dos Santos, Thiago Oliveira
e Sá, Asla Medeiros
da Silveira, Thiago Lopes Trugillo
Brazil, Emilio Vital
Ponti, Moacir A.
Fernandes, Leandro A. F.
Avila, Sandra
e-Mail Address: lavourasantos@gmail.com
Conference Name: Conference on Graphics, Patterns and Images, 34 (SIBGRAPI)
Conference Location: Gramado, RS, Brazil (virtual)
Date: 18-22 Oct. 2021
Publisher: IEEE Computer Society
Publisher City: Los Alamitos
Book Title: Proceedings
Tertiary Type: Full Paper
History (UTC): 2021-09-04 20:35:48 :: lavourasantos@gmail.com -> administrator ::
2022-03-02 00:54:15 :: administrator -> menottid@gmail.com :: 2021
2022-03-02 13:36:21 :: menottid@gmail.com -> administrator :: 2021
2022-06-14 00:00:25 :: administrator -> :: 2021
3. Content and structure
Is the master or a copy? is the master
Content Stage: completed
Transferable: 1
Version Type: finaldraft
Keywords: form understanding
text detection
spatial layout analysis
Abstract: A significant number of companies still depend on printed documents, such as healthcare reports, engineering specifications, or historical documents. These documents are diverse in layout and content, so each document structure requires a different approach, which makes information extraction a costly and inefficient task. We classify documents into three categories: non-structured, semi-structured, and structured, the last being the focus of the present work. We propose a pattern recognition method for structured documents that establishes an anchoring relationship between question-answer objects through a system of hypotheses and a probability distribution, in order to identify which predefined model a document belongs to. The method therefore acts as a system for both identifying structured documents and extracting their content. It shows promising results for pattern recognition across all document models, with 78% to 97% of objects extracted correctly.
Arrangement 1: urlib.net > SDLA > Fonds > SIBGRAPI 2021 > A Form Understanding...
Arrangement 2: urlib.net > SDLA > Fonds > Full Index > A Form Understanding...
doc Directory Content: access
source Directory Content: there are no files
agreement Directory Content:
agreement.html 04/09/2021 17:35 1.3 KiB
4. Conditions of access and use
data URL: http://urlib.net/ibi/8JMKD3MGPEW34M/45CKPKE
zipped data URL: http://urlib.net/zip/8JMKD3MGPEW34M/45CKPKE
Language: en
Target File: Sibgrapi_2021 - Paper ID 64.pdf
User Group: lavourasantos@gmail.com
Visibility: shown
Update Permission: not transferred
5. Allied materials
Mirror Repository: sid.inpe.br/banon/2001/03.30.15.38.24
Next Higher Units: 8JMKD3MGPEW34M/45PQ3RS
8JMKD3MGPEW34M/4742MCS
Citing Item List: sid.inpe.br/sibgrapi/2021/11.12.11.46 5
Host Collection: sid.inpe.br/banon/2001/03.30.15.38
6. Notes
Empty Fields: archivingpolicy archivist area callnumber contenttype copyholder copyright creatorhistory descriptionlevel dissemination edition electronicmailaddress group isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder schedulinginformation secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url volume

